SOC2069
Researching Social Life 1

Quantitative data and descriptive statistics

Dr. Chris Moreh

Outline

  1. Variables
  2. Descriptive statistics

Variables

What is a variable?

  • Statistical methods help us determine the factors that explain variability among subjects/respondents
  • For instance, variation occurs from student to student in their grades. What factors are responsible for that variability?
  • Any characteristic that we can measure for each subject is called a variable
  • Variables are characteristics that can vary in value among subjects in a sample or population
  • Examples of variables are income last year, number of children or siblings, whether employed, gender, how much one likes ice-cream on a scale of 1 to 10, etc.
  • The values the variable can take form the measurement scale
  • For gender, for instance, the measurement scale consists of two (or more) labels, such as (female, male, other). For number of children/siblings, it would be (0, 1, 2, 3, 4, …)

Measurement scales

  • A variable is called quantitative when the measurement scale has numerical values that represent different magnitudes of the variable
  • A variable is called categorical when the measurement scale is a set of categories
  • For categorical variables, distinct categories differ in quality, not in numerical magnitude. For this reason, categorical variables are often called qualitative (but we won’t use that term, to avoid confusion with the qualitative data we covered in the first half of the module)

Measurement scales

The position of ordinal scales on the quantitative–qualitative classification is fuzzy. Because their scale is a set of categories, they are often analyzed using the same methods as nominal scales. But in many respects, ordinal scales more closely resemble interval scales. They possess an important quantitative feature: each level has a greater or smaller magnitude than another level

Measurement scales

A variable is discrete if its possible values form a set of separate numbers, such as (0, 1, 2, 3, ...).

A variable is continuous if it can take an infinite continuum of possible real number values.

Where do variables come from?

Descriptive statistics

Describing categorical variables

  • Categorical data are characterized by a frequency distribution
  • A frequency table is a listing of possible values for a variable, together with the number of observations (n) at each value
  • When the table shows the proportions or percentages instead of the numbers, it is called a relative frequency distribution
  • Frequency distributions can also be visualised with a bar graph

Describing numeric variables

Quantitative variables can be summarised by measures of central tendency and variation (spread)


Central tendency

The mode also applies to categorical variables, where it is more useful: it describes the category with the highest frequency
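As a rough illustration of central tendency, the sketch below computes three common measures with Python's standard statistics module. The slides name only the mode explicitly; the mean and median are assumed here as the other two usual measures, and both data sets are made up.

```python
import statistics

# Hypothetical number of siblings reported by nine respondents (made-up data).
siblings = [0, 1, 1, 2, 2, 2, 3, 4, 12]

mean = statistics.mean(siblings)      # sum of values divided by their count
median = statistics.median(siblings)  # middle value once the data are sorted
mode = statistics.mode(siblings)      # most frequent value

print(mean, median, mode)

# Of the three, only the mode can be computed for unordered categories.
transport = ["bus", "car", "bus", "bike", "bus", "car"]
top_category = statistics.mode(transport)
print(top_category)
```

In this made-up sample the single large value (12) pulls the mean above the median, which is one reason to report more than one measure.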

Variation (spread)
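The slides do not list specific measures of spread at this point, so the sketch below assumes three widely used ones: the range, the sample variance and the sample standard deviation. The two groups of marks are made-up data chosen to have the same mean but different spread, and spread() is a hypothetical helper.

```python
import statistics

# Hypothetical exam marks for two seminar groups (made-up data).
group_a = [58, 59, 60, 61, 62]
group_b = [40, 50, 60, 70, 80]

def spread(values):
    """Return the range, sample variance and sample standard deviation."""
    value_range = max(values) - min(values)
    variance = statistics.variance(values)  # divides by n - 1
    std_dev = statistics.stdev(values)      # square root of the variance
    return value_range, variance, std_dev

# Both groups have the same mean, but very different spread.
print(statistics.mean(group_a), spread(group_a))
print(statistics.mean(group_b), spread(group_b))
```

Two distributions can share a centre and still look very different, which is why a measure of central tendency is usually reported together with a measure of variation.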


Describing numeric variables

Quantitative variables can be visualised with a histogram (a special frequency distribution with grouped numeric values)
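To show what "grouped numeric values" means in practice, here is a minimal text-only sketch that groups values into equal-width intervals and counts the observations in each. The ages are made-up data, and the interval width and starting point are assumptions; a different choice would change the shape of the histogram.

```python
# Hypothetical ages of 15 respondents (made-up data).
ages = [18, 19, 21, 22, 22, 25, 27, 31, 33, 34, 38, 41, 45, 52, 67]

bin_width = 10   # assumed width of each age group
start = 10       # assumed lower edge of the first group

# Count how many values fall in each interval [lower, lower + bin_width).
counts = {}
for age in ages:
    lower = start + ((age - start) // bin_width) * bin_width
    counts[lower] = counts.get(lower, 0) + 1

# Draw one row of '#' per group, including any empty groups in between.
for lower in range(start, max(counts) + bin_width, bin_width):
    n = counts.get(lower, 0)
    print(f"{lower:>3}-{lower + bin_width - 1:<3} {'#' * n} ({n})")
```

Unlike a bar graph of categories, the groups here are adjacent intervals on a numeric scale, so their order and width carry meaning.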

The normal distribution

Skewed distribution

Quartiles and outliers

Boxplot:
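The original figure is not reproduced here. As a hedged stand-in, the sketch below computes the numbers a boxplot is usually drawn from: the three quartiles, the interquartile range (IQR), and any values flagged as outliers. The data are made up, the 1.5 × IQR rule is an assumed (though very common) convention, and the "inclusive" quartile method is only one of several in use.

```python
import statistics

# Hypothetical weekly hours of paid work for 11 respondents (made-up data).
hours = [0, 8, 10, 12, 15, 16, 18, 20, 22, 25, 60]

# Quartiles cut the sorted data into four equal parts.
# method="inclusive" is one convention; others give slightly different cuts.
q1, q2, q3 = statistics.quantiles(hours, n=4, method="inclusive")
iqr = q3 - q1  # width of the "box" in a boxplot

# Assumed rule of thumb: values more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in hours if h < lower_fence or h > upper_fence]

print(q1, q2, q3, iqr, outliers)
```

The second quartile is the median, so a boxplot shows a measure of central tendency and a measure of spread in the same picture.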